


Creators/Authors contains: "Estaki, Mehrbod"


  1. Mackelprang, Rachel (Ed.)
    ABSTRACT Increasing data volumes on high-throughput sequencing instruments such as the NovaSeq 6000 lead to long computational bottlenecks for common metagenomics data preprocessing tasks such as adaptor and primer trimming and host removal. Here, we test whether faster, recently developed computational tools (Fastp and Minimap2) can replace widely used choices (Atropos and Bowtie2), obtaining dramatic accelerations with additional sensitivity and minimal loss of specificity for these tasks. Furthermore, the taxonomic tables resulting from downstream processing provide biologically comparable results. However, we demonstrate that for taxonomic assignment, Bowtie2’s specificity is still required. We suggest that periodic reevaluation of pipeline components, together with improvements to standardized APIs to chain them together, will greatly enhance the efficiency of common bioinformatics tasks while also facilitating incorporation of further optimized steps running on GPUs, FPGAs, or other architectures. We also note that a detailed exploration of available algorithms and pipeline components is an important step that should be taken before optimization of less efficient algorithms on advanced or nonstandard hardware.

    IMPORTANCE In shotgun metagenomics studies that seek to relate changes in microbial DNA across samples, processing the data on a computer often takes longer than obtaining the data from the sequencing instrument. Recently developed software packages that perform individual steps in the pipeline of data processing in principle offer speed advantages, but in practice they may contain pitfalls that prevent their use; for example, they may make approximations that introduce unacceptable errors in the data. Here, we show that differences in choices of these components can speed up overall data processing by 5-fold or more on the same hardware while maintaining a high degree of correctness, greatly reducing the time taken to interpret results. This is an important step for using the data in clinical settings, where the time taken to obtain the results may be critical for guiding treatment.
  2. Abstract

    Graves’ disease (GD) is the most common organ-specific autoimmune disease and has been linked in small pilot studies to taxonomic markers within the gut microbiome. Important limitations of this work include small sample sizes and low-resolution taxonomic markers. Accordingly, we studied 162 gut microbiomes of patients with mild and severe GD and of healthy controls. Taxonomic and functional analyses based on metagenome-assembled genomes (MAGs) and MAG-annotated genes, together with predicted metabolic functions and metabolite profiles, revealed a well-defined network of MAGs, genes and clinical indexes separating healthy from GD subjects. A supervised classification model identified a combination of biomarkers including microbial species, MAGs, genes and SNPs, with predictive power superior to models from any single biomarker type (AUC = 0.98). Global, cross-disease multi-cohort analysis of gut microbiomes revealed high specificity of these GD biomarkers, notably discriminating against Parkinson’s disease, and suggesting that non-invasive stool-based diagnostics will be useful for these diseases.

     